Maximal-entropy random walk unifies centrality measures 

In this paper analogies between different (dis)similarity matrices are derived. These matrices, 
which are connected to path enumeration and random walks, are used in community detection 
methods or in computation of centrality measures for complex networks. The focus is on a number 
of known centrality measures, which inherit the connections established for similarity matrices. 
These measures are based on the principal eigenvector of the adjacency matrix, path enumeration, 
as well as on the stationary state, stochastic matrix or mean first-passage times of a random walk. 
Particular attention is paid to the maximal-entropy random walk, which serves as a very distinct 
alternative to the ordinary random walk used in network analysis. 

The various importance measures, defined both with the use of ordinary random walk and the 
maximal-entropy random walk, are compared numerically on a set of benchmark graphs. It is shown 
that groups of centrality measures defined with the two random walks cluster into two separate 
families. In particular, the group of centralities for the maximal-entropy random walk, connected 
to the eigenvector centrality and path enumeration, is strongly distinct from all the other measures 
and produces largely equivalent results. 

I. INTRODUCTION 

Graphs represent abstracted relationships between en- 
tities. They form a structure upon which a process may 
take place, which is often formalized into the mathemat- 
ical concept called random walk. Together, graph and 
random walk can constitute a model for citations in sci- 
entific collaboration networks, opinion spread on social 
networks or data transmission on the Internet. Instead 
of these kinds of information transfer, more tangible subjects
may be considered, such as molecule movement on physical
or biological networks. Whatever the exact nature
of the phenomenon, the natural question arises: which 
entity in the network is the most influential, be it a gene 
or a transcription factor, an overloaded hub, a frequented 
website or a renowned researcher. 

A number of importance (or centrality) measures answering
that question have been invented to study social
(e.g., [1]; [2] is an extensive resource) or telecommunication
networks (e.g., HITS [3], PageRank [4]). A significant
portion of ideas defining the measures originate
from graph theory (the degree of a vertex, enumeration
of paths or the principal eigenvector of the adjacency matrix)
and the theory of Markov chains (stationary states
of random walks, their stochastic matrices, and mean
first-passage times). Likewise, most of these approaches
have been widely utilized in algorithms of community
finding [5].

In this note, we show that a number of these ideas
can be formulated in a common framework, that they produce
nearly equivalent results for a given random walk,
and additionally, that the results are more distinct from other
methods if they make use of the maximal-entropy random
walk (MERW).

This random walk (RW) is defined so as to ensure 
equiprobability of all paths of a given length and end- 
points. Other RWs usually do not demand as much, for 
instance equal probabilities of going from a node to any of 
its nearest neighbors are enough to define what is called 
here the generic random walk (GRW), one that is well- 
known and commonly used. It is often the case that ei- 
ther GRW or some biased RWs are better suited for par- 
ticular problems. However, the author believes MERW 
should serve alongside GRW as a null model of random 
processes on networks, for a good reason: it is GRW that 
maximizes entropy locally (entropy of the nearest neigh- 
bor selection), and it is MERW that maximizes entropy 
globally (entropy of the path selection). Since a random 
walk can be seen as an ensemble of possible paths, it is 
the latter that yields the largest entropy for that ensem- 
ble [fl 0]- Thus, in a sense, it is the most random of 
random walks. 

MERW exhibits behaviors that may be of general in- 
terest: its stationary distribution localizes on diluted lat- 
tices , its relaxation to stationary state is extremely 
fast on Cayley trees 0, Hf3 |. and it is very slow be- 
tween two identical connected k-regular regions [ll| . The 
equiprobable paths that MERW produces are the natural 
candidates for an ensemble used in Feynman path inte- 
grals (in models of discrete quantum gravity with curved 
space-time) Q or in the optimal sampling algorithm in 
the path-integral Monte Carlo methods. Since entropy
maximization is a global principle, conceptually
analogous to the least action principle, it was also studied
in biology and has led to the concept of evolutionary
entropy [13]. The same authors have found the value of
entropy for a given graph useful in selection of robust
networks [T3|- Lastly, MERW has begun to be used as a 
tool for analysis of complex networks [15]-[19].



II. GENERIC AND MAXIMAL-ENTROPY 
RANDOM WALKS 

Let us consider a finite connected undirected graph.
We define a discrete-time random walk on this graph by
a stochastic (or transition) matrix P. Its entry $P_{ij} \geq 0$ is
the probability that if a random walker stays on a node
$i$ at a time $t$, it will step to a node $j$ at time $t + 1$.
Any row $P_{i*}$ contains the probabilities of moving to all
neighbors of $i$, and since the walker cannot disappear
from the graph nor be created, they all sum up to unity,
$\sum_j P_{ij} = 1$. As we assume that the walker can only
move to neighboring nodes, the stochastic matrix can
have a non-zero entry only if the adjacency matrix of the
graph has a non-zero entry at the same place. In short,
$\forall i,j: P_{ij} \leq A_{ij}$, where A is the adjacency matrix of
the graph. Elements of this matrix can take two values:
$A_{ij} = 1$ if $i$ and $j$ are neighbors and $A_{ij} = 0$ otherwise.
Both P and A are assumed to be time-independent.

The probability that the random walker stays at a
given vertex $i$ of the graph at a given time $t$ is encoded in
the $i$-th element of the vector $\vec\pi(t)^T = (\pi_1(t), \ldots, \pi_N(t))$.
Thus, the initial distribution of probabilities is $\vec\pi(0)^T$,
and the distribution after $t$ steps is
$\vec\pi(t)^T = \vec\pi(t-1)^T P = \vec\pi(0)^T P^t$,
where the stochastic matrix has been multiplied
$t$ times.

A quantity of interest, given by a solution of

$$\vec\pi^T = \vec\pi^T P, \tag{1}$$

is the stationary probability distribution (or stationary
state), which may be understood as the probability distribution
after infinite time. We assume it exists (see footnote 1).

The ordinary or, as we call it, generic random walk corresponds
to the standard random walks used by Einstein,
Smoluchowski or Polya. It is realized by the following
stochastic matrix

$$P_{ij} = \frac{A_{ij}}{k_i}, \tag{2}$$

where $k_i = \sum_j A_{ij}$ denotes the degree of the $i$-th node. Its stationary
state is proportional to the degrees and is given
by $\pi_i = k_i / \sum_j k_j$. The $i$-th row of the matrix contains
uniform probabilities, each equal to $1/k_i$, of selecting any
of the $k_i$ neighbors of the node $i$. Thus, the entropy of
neighbor selection is maximal.
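
As a minimal numerical sketch of Eq. (2), assuming NumPy and a hypothetical five-node graph (a triangle with a two-node tail, chosen only for illustration and not taken from the benchmarks used later), the GRW transition matrix and its stationary state may be computed as follows. The names `A`, `P_grw`, `pi_grw` and `pi_t` are illustrative.

```python
import numpy as np

# Hypothetical example graph (connected, non-bipartite); not from the paper.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
N = 5
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

k = A.sum(axis=1)           # degrees k_i = sum_j A_ij
P_grw = A / k[:, None]      # Eq. (2): P_ij = A_ij / k_i
pi_grw = k / k.sum()        # stationary state pi_i = k_i / sum_j k_j

# Evolve a uniform initial distribution: pi(t)^T = pi(0)^T P^t
pi_t = np.full(N, 1.0 / N)
for _ in range(200):
    pi_t = pi_t @ P_grw

print(np.round(pi_grw, 3), np.round(pi_t, 3))
```

On this small graph the iterated distribution should agree with the degree-proportional stationary state; for bipartite graphs the iteration would oscillate instead, as noted in footnote 1.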



1 A stationary state exists if an undirected graph is not bipartite, 
but even for bipartite graphs a semi-stationary state can be de- 
fined by averaging probability distribution over two consecutive 
time steps. 



The other type of RW introduced earlier maximizes
the entropy of random trajectories, and hence is called
here the maximal-entropy random walk. This maximization
condition leads to a unique stochastic matrix

$$P_{ij} = \frac{A_{ij}}{\lambda_0} \frac{\psi_{0j}}{\psi_{0i}}, \tag{3}$$

where $\lambda_0$ is the largest eigenvalue of the adjacency matrix
A, and $\psi_{0i}$ is the $i$-th element of the principal eigenvector
$\vec\psi_0$. Since the adjacency matrix is irreducible, the
Frobenius-Perron theorem guarantees that all elements
of this vector are strictly positive, thus the condition
$P_{ij} \leq A_{ij}$ is fulfilled.

MERW has the stationary probability distribution
given by Shannon-Parry measure [20]

$$\pi_i = \psi_{0i}^2. \tag{4}$$

Let us note that this formula allows one to interpret $\psi_{0i}$ as
the wave function of the ground state of the operator $-A$
and $\psi_{0i}^2$ as the probability of finding a particle in this
state [E0|> thus relating MERW to quantum mechanics. 

It is easily seen that the two RWs, (2) and (3), are
identical on k-regular graphs. This should be considered
an exception, as in general their properties are entirely 
distinct and contrasting. 
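
The definitions (3) and (4) can be checked numerically. The sketch below assumes NumPy, a hypothetical helper `merw`, and two illustrative graphs (a five-node graph with a triangle and a tail, and a 5-cycle as a 2-regular example); none of these names or graphs come from the paper.

```python
import numpy as np

def merw(A):
    """Return (P, pi, lam0, psi0) for MERW on a connected undirected graph."""
    eigvals, eigvecs = np.linalg.eigh(A)              # ascending eigenvalues
    lam0 = eigvals[-1]                                # largest eigenvalue
    psi0 = np.abs(eigvecs[:, -1])                     # Perron vector, made positive
    P = A * psi0[None, :] / (lam0 * psi0[:, None])    # Eq. (3)
    return P, psi0 ** 2, lam0, psi0                   # pi_i = psi_0i^2, Eq. (4)

# Hypothetical irregular graph: triangle 0-1-2 with tail 2-3-4.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
P, pi, lam0, psi0 = merw(A)

# Equiprobability of paths with the same length and end-points:
# the closed paths 0->1->0 and 0->2->0 both have probability 1/lam0^2.
p_via_1 = P[0, 1] * P[1, 0]
p_via_2 = P[0, 2] * P[2, 0]

# 2-regular example (5-cycle): MERW and GRW coincide.
C5 = np.zeros((5, 5))
for i in range(5):
    C5[i, (i + 1) % 5] = C5[(i + 1) % 5, i] = 1.0
P_ring_merw = merw(C5)[0]
P_ring_grw = C5 / C5.sum(axis=1)[:, None]
```

The absolute value is taken only to fix the arbitrary sign returned by the eigensolver; for a connected graph the principal eigenvector has entries of a single sign.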



III. RELATIONS BETWEEN THE 
STOCHASTIC MATRIX, ITS DISTANCE 
MATRIX, MEAN FIRST-PASSAGE TIME 
MATRIX, AND THE RESOLVENT OF 
ADJACENCY MATRIX 

In general, a stochastic matrix may not be symmetric,
and so it may have different right and left eigenvectors:

$$P \vec\Psi_\alpha = \Lambda_\alpha \vec\Psi_\alpha, \qquad \vec\Phi_\alpha^T P = \Lambda_\alpha \vec\Phi_\alpha^T, \tag{5}$$

which results in a spectral decomposition

$$P = \sum_\alpha \Lambda_\alpha \vec\Psi_\alpha \vec\Phi_\alpha^T. \tag{6}$$

Let us consider a class of Markov processes whose
stochastic matrix can be transformed into a symmetric
matrix

$$S = D^{-1/2} P D^{1/2}, \tag{7}$$

where we define $D = \mathrm{diag}(\vec\pi)^{-1}$, which denotes the matrix
with diagonal entries equal to the inverses of the
stationary state vector's elements. It follows that

$$\vec\Psi_\alpha = D \vec\Phi_\alpha. \tag{8}$$

This relation does not hold in general but is clearly obtained
for GRW and MERW, which satisfy (7), as shown
below.

For GRW, stochastic matrix (2) can be written as

$$P = \mathrm{diag}(k_1, k_2, \ldots, k_N)^{-1} A, \tag{9}$$

where the adjacency matrix can be decomposed into
$A = \sum_j k_j \sum_\alpha \Lambda_\alpha \vec\eta_\alpha \vec\eta_\alpha^T$. Substitution of the stationary
state of GRW into the diagonal matrix yields
$D = \mathrm{diag}(k_1, k_2, \ldots, k_N)^{-1} \sum_j k_j$, hence

$$P = \sum_\alpha \Lambda_\alpha D \vec\eta_\alpha \vec\eta_\alpha^T, \tag{10}$$

and thus the eigenvectors are given by

$$\vec\Psi_\alpha = D \vec\eta_\alpha, \qquad \vec\Phi_\alpha = \vec\eta_\alpha, \tag{11}$$

which are related as given in (5).

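Similarly, MERW allows for expression of all the eigenvalues
and eigenvectors of the stochastic matrix P in
terms of eigenvalues $\lambda_\alpha$ and eigenvectors $\vec\psi_\alpha$ of the
adjacency matrix A

$$\Lambda_\alpha = \frac{\lambda_\alpha}{\lambda_0}, \qquad \vec\Psi_\alpha = D^{1/2} \vec\psi_\alpha, \qquad \vec\Phi_\alpha = D^{-1/2} \vec\psi_\alpha, \tag{12}$$

where $D = \mathrm{diag}(\psi_{01}^{-2}, \psi_{02}^{-2}, \ldots, \psi_{0N}^{-2})$. In particular,
$\Lambda_0 = 1$, $\Psi_{0i} = 1$, and $\Phi_{0i} = \psi_{0i}^2 = \pi_i$ for all $i$. The
spectral decomposition of P then reads

$$P_{ij} = \sum_\alpha \Lambda_\alpha \Psi_{\alpha i} \Phi_{\alpha j} = \sum_\alpha \Lambda_\alpha \psi_{\alpha i} \psi_{\alpha j} \psi_{0j} \psi_{0i}^{-1}. \tag{13}$$

Thus, clearly all properties of MERW are encoded in the
spectral decomposition of the adjacency matrix of a given
graph; it allows for an easier derivation of, for example,
the stationary state and dynamical characteristics of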
MERW for Cayley trees (iEflUJl. 
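
The relations (12) and (13) lend themselves to a direct numerical check. The following is a sketch under the assumption of NumPy and a hypothetical five-node example graph; the array names (`Lambda`, `Psi`, `Phi`, `P_rebuilt`) are illustrative and store eigenvectors as columns.

```python
import numpy as np

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

lam, psi = np.linalg.eigh(A)          # eigenpairs of A: lam[a], psi[:, a]
lam0 = lam[-1]
psi0 = np.abs(psi[:, -1])             # positive principal eigenvector
P = A * psi0[None, :] / (lam0 * psi0[:, None])   # MERW, Eq. (3)

Lambda = lam / lam0                   # eigenvalues of P, Eq. (12)
Psi = psi / psi0[:, None]             # right eigenvectors D^{1/2} psi_alpha
Phi = psi * psi0[:, None]             # left eigenvectors D^{-1/2} psi_alpha

# Eq. (13): P = sum_alpha Lambda_alpha Psi_alpha Phi_alpha^T
P_rebuilt = (Psi * Lambda) @ Phi.T
```

Here $D^{1/2} = \mathrm{diag}(1/\psi_{0i})$, so multiplying or dividing the rows of the eigenvector matrix by `psi0` implements the two transformations in (12).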

Since methods of both assessing centrality [Hf and 
finding communities [H, H3| have utilized calculating 
powers of the stochastic matrix, these spectral decompositions
allow us to make further observations. Latapy and Pons [24]
used the distance matrix

$$r_{ij}(t) = \sqrt{\sum_k \frac{\left(P^t_{ik} - P^t_{jk}\right)^2}{\pi_k}}, \tag{14}$$

where P and $\vec\pi$ were meant to correspond to GRW, and
the division by $\pi_k$ was supposed to reduce the effect of a
vertex's centrality. It is mentioned that $r^2$, the entrywise
square of this distance matrix, is equivalent to

$$r^2_{ij}(t) = \sum_{\alpha=1}^{N-1} \Lambda_\alpha^{2t} \left(\Psi_{\alpha i} - \Psi_{\alpha j}\right)^2, \tag{15}$$

based on spectral decomposition of P (5).

In the case of MERW and GRW (generally, for any 
RW for which S defined in (7) is symmetric), the spectral
decomposition (fTB")) leads to the compact form 

$$r^2(t) = D\left[(P^{2t})_{dg} E - (P^{2t})^T\right] + \left[E (P^{2t})_{dg} - P^{2t}\right] D, \tag{16}$$

where $(P^{2t})_{dg}$ is a matrix with $(P^{2t})_{ii}$ on the diagonal
and zeros otherwise, and E is a matrix of all ones. This is a new formula, which however
very much resembles a symmetrized version of a
quantity known as mean first-passage time matrix.
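
The equivalence between the entrywise square of (14) and the compact form (16) can be verified numerically. The sketch below assumes NumPy, uses GRW on a hypothetical five-node graph, and picks $t = 3$ arbitrarily; variable names are illustrative.

```python
import numpy as np

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

k = A.sum(axis=1)
P = A / k[:, None]                 # GRW transition matrix
pi = k / k.sum()                   # its stationary state
D = np.diag(1.0 / pi)
E = np.ones((N, N))
t = 3

# Direct definition: r2_ij = sum_k (P^t_ik - P^t_jk)^2 / pi_k
Pt = np.linalg.matrix_power(P, t)
diff = Pt[:, None, :] - Pt[None, :, :]
r2_direct = (diff ** 2 / pi).sum(axis=2)

# Compact form, Eq. (16), built from P^{2t} only
P2t = np.linalg.matrix_power(P, 2 * t)
P2t_dg = np.diag(np.diag(P2t))
r2_compact = D @ (P2t_dg @ E - P2t.T) + (E @ P2t_dg - P2t) @ D
```

The agreement relies on the detailed-balance property $\pi_i P^t_{ik} = \pi_k P^t_{ki}$, which is what makes S in (7) symmetric; for a walk without that property the two expressions would generally differ.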

Mean first-passage time (MFPT) matrix M is a useful
concept for studying RWs. Its elements $M_{if}$ encode the
average time to reach the final vertex $f$ from the initial
vertex $i$ for the first time. We invoke a neat construction
of the matrix given by Kemeny and Snell [25], [26]: first,
we define the fundamental matrix

$$Z = (\mathbb{1} - P + \vec e\, \vec\pi^T)^{-1}, \tag{17}$$

where $\mathbb{1}$ is the identity matrix, and $\vec e = (1, 1, \ldots, 1)^T$. The
MFPT matrix is then given by

$$M = (E Z_{dg} - Z) D, \tag{18}$$

where E is a matrix of all ones, $Z_{dg}$ is a diagonal matrix
with elements $(Z_{dg})_{ii} = Z_{ii}$, and D was introduced in
(7).

The fundamental matrix is defined so as to contain all
the powers of the stochastic matrix P, which follows from
expansion of $(\mathbb{1} - P)^{-1}$ in a series $\mathbb{1} + P + P^2 + \ldots$. However,
the matrix $\mathbb{1} - P$ is non-invertible and consequently the
expansion does not exist. The correction $\vec e\, \vec\pi^T$ allows for
a well-defined inversion. In fact, instead of the fundamental
matrix one may use other so-called generalized
inverses (the formalism is summarized in [27]), although
we use (17) for its conceptual and computational simplicity.
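
To illustrate (17) and (18), the following sketch (assuming NumPy, GRW on a hypothetical five-node graph, and illustrative function names) computes the MFPT matrix from the fundamental matrix and compares it with a direct solution of the hitting-time equations $m_i = 1 + \sum_{j \neq f} P_{ij} m_j$.

```python
import numpy as np

def mfpt_fundamental(P, pi):
    """MFPT matrix from the fundamental matrix, Eqs. (17)-(18)."""
    N = len(pi)
    Z = np.linalg.inv(np.eye(N) - P + np.outer(np.ones(N), pi))   # Eq. (17)
    E = np.ones((N, N))
    D = np.diag(1.0 / pi)
    return (E @ np.diag(np.diag(Z)) - Z) @ D                      # Eq. (18)

def mfpt_direct(P):
    """MFPT matrix from the linear hitting-time equations, one target at a time."""
    N = P.shape[0]
    M = np.zeros((N, N))
    for f in range(N):
        keep = [i for i in range(N) if i != f]
        Q = P[np.ix_(keep, keep)]          # walk restricted to non-target nodes
        M[keep, f] = np.linalg.solve(np.eye(N - 1) - Q, np.ones(N - 1))
    return M

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
k = A.sum(axis=1)
P, pi = A / k[:, None], k / k.sum()

M = mfpt_fundamental(P, pi)
M_check = mfpt_direct(P)
```

With the form (18) the diagonal of M is zero, i.e. the time to reach a vertex from itself is taken to be zero rather than the mean return time.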

In fact, also in (14) we may take $\sum_{t=0}^{\infty} P^t$ instead of $P^t$
to account for all the powers of the stochastic matrix, just
as above in the case of the MFPT matrix. This infinite
sum

$$\sum_{t=0}^{\infty} P^t \tag{19}$$

reproduces the path-integral (MERW) and field-theoretical
(GRW) propagator G of a free relativistic
particle, as has been shown in [7], which supports the
view that the stationary probability (4) is reminiscent of
the square of a wave function.

Nevertheless, the matrix G needs an elaboration. Its
elementary definition, more general than in (19), is
specifically given by the number of paths between any
two nodes, which is just powers of the adjacency matrix
$A^t$ with all path lengths $t$ taken into account. As the
number of paths dramatically grows with their length,
however, a normalizing parameter $e^{\mu} > \lambda_0$ has to be introduced
for the sum of paths to converge:

$$G(\mu) = \sum_{t=0}^{\infty} e^{-\mu t} A^t. \tag{20}$$

From the point of view of paths' statistics, $G(\mu)$ defines
the grand-canonical ensemble of paths. An element
$G_{fi}(\mu)$ corresponds to the grand canonical partition function,
$\mu$ to the chemical potential, and the average path
length is $\langle t \rangle_{fi} = -\partial_\mu \ln G_{fi}(\mu)$.

In equations (19) and (22), G is an abbreviated notation
for $G(\mu_0)$, where the special choice of $\mu_0 = \ln \lambda_0$ is
related to the graph structure by the largest eigenvalue
of the adjacency matrix.

While the arbitrary choice of $\mu$ is equivalent to a cutoff
of a path length of the order $T = 1/\Delta\mu$, where
$\Delta\mu = \mu - \mu_0$, the limit $\mu \to \mu_0$ leads to dominance of infinite
paths, and to a singularity of $G(\mu)$. However, $\mu = \ln \lambda'$
very close to $\mu_0$ can be taken, yielding

$$G(\lambda') = \sum_{t=0}^{\infty} \frac{A^t}{\lambda'^t} = \lambda' (\lambda' \mathbb{1} - A)^{-1}, \tag{21}$$

which is the resolvent of the adjacency matrix. To define
$G(\mu)$ exactly at the singularity $\mu = \mu_0$, the matrix $\lambda_0 \mathbb{1} - A$
has to be projected to the subspace perpendicular to
$\vec\psi_0$ before inversion. It can be done similarly as in (17)
by taking $(\lambda_0 \mathbb{1} - A + \lambda_0 \vec\psi_0 \vec\psi_0^T)^{-1}$, which eliminates the
zeroth eigenmode. This is expected and advantageous in
community finding methods, as discussed in [28].
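
The identity (21) between the weighted sum over path counts and the resolvent can be illustrated as follows. This is a sketch assuming NumPy, a hypothetical five-node graph, and an arbitrary illustrative choice $\lambda' = 1.05\,\lambda_0$; the series is truncated after a fixed number of terms.

```python
import numpy as np

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

lam0 = np.linalg.eigvalsh(A)[-1]      # largest eigenvalue of A
lam_p = 1.05 * lam0                   # lambda' slightly above lambda_0 (assumed value)

# Right-hand side of Eq. (21): the resolvent
G_resolvent = lam_p * np.linalg.inv(lam_p * np.eye(N) - A)

# Left-hand side of Eq. (21): truncated sum of A^t / lambda'^t
G_series = np.zeros((N, N))
term = np.eye(N)                      # t = 0 term
for _ in range(2000):
    G_series += term
    term = term @ A / lam_p
```

The closer $\lambda'$ is taken to $\lambda_0$, the more terms the truncated series needs, which reflects the dominance of long paths near the singularity.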
Finally, we obtain 

$$r^2 = D (G^2)_{dg} E - 2 \sqrt{D}\, G^2 \sqrt{D} + E (G^2)_{dg} D, \tag{22}$$

with G functioning as an analogue of the fundamental 
matrix Z, and where the time dependence has been elim- 
inated. Thanks to the symmetry of the matrix, however, 
the singularity of G cancels out even without the pro- 
jection discussed in the paragraph above, in contrast to 
the definition of MFPT with the use of the fundamental 
matrix. 



IV. CENTRALITY MEASURES

The above considerations constitute a common framework
for a number of centrality measures. Below, the
connections between them are reviewed and established.



A. Centrality based on paths

The original concept of counting paths to assess centrality
was introduced in 1953 [29]. The idea is to count
all the paths that lead to a vertex whose importance we
measure [30], where the number of paths of length $t$ between
vertices $i$ and $f$ of a graph is given by the element
$(A^t)_{fi}$ of the $t$-th power of the adjacency matrix. This
corresponds exactly to the definition shown in (20).

The importance of the final vertex $f$ is then given by
the element $I_f$ of the vector $\vec I = (G(\mu) - \mathbb{1}) \vec e \propto \vec\psi_0$, where
the uniform vector $\vec e$ was chosen as a set of initial weights,
and the proportionality to the principal eigenvector holds
near $\mu_0$. This special choice thus connects the statistics
of paths, a random walk, and the graph structure at the
same time. In the limit $\mu \to \mu_0$, whose nuances are
explained at the end of the previous section, the proportionality
constant diverges, and the contribution of other
eigenvectors to the centrality is negligible. In Sec. V,
instead of restricting $\mu$, finite sums are taken, with the
maximal length of the enumerated paths equal to the diameter
of a given graph. The elements of $\vec I$ are squared
so that they correspond to the stationary probability of
MERW.

The path weights exponential with respect to the
path's length were also employed in [22], although several
restrictions on the paths' sets were proposed, e.g.,
only the shortest paths, K-short paths or K-short vertex-disjoint
paths. For comparison in Sec. V we use only the
shortest paths without any constraints on their length.

Alternatively, instead of $e^{-\mu t}$, factorial weights $\beta^t/t!$
might be introduced [31], [HJ (these papers deal primarily
with the problem of community finding), where $\beta$ is a
temperature-like parameter tuning the length of paths
one wants to account for. This yields another quantity
that may be considered a similarity matrix: the heat
kernel $K(-\beta) = e^{\beta A}$, which is related to heat diffusion or a
continuous-time quantum walk, and can be understood
as the Green's function of a network of springs.

It is, however, the former weight choice (20) that generates
the unique maximally-entropic random walk. Those
weights make MERW directly reflect the structure of the
graph, which is explicit in the transition matrix definition
(3), or conversely, appropriately weighted paths gain
the interpretation of a random walk.



B. Centrality based on powers of the transition
matrix

Equation (19) shows that path enumeration is equivalent
to the propagation of MERW. Let us note, however,
that the walks are also weighted by the ratio
$\psi_{0j}/\psi_{0i} = \sqrt{\pi_j/\pi_i}$ of stationary probabilities of the two
vertices. It is a reasonable intuition that the importance
of a random walk trajectory depends on the importance
of the initial and final vertices. It seems that the problem
of calculating centrality by employing the transition matrix
becomes self-consistent (importance calculated from
paths, whose weights depend on the importance), and thus
gets rid of arbitrariness.

The method of assessing centrality by summing consecutive
powers of the transition matrix is stated in [22]:

$$\vec I^T = \sum_{t=1}^{T} \vec\pi(0)^T P^t, \tag{23}$$

where for simplicity we choose uniform initial probability
distribution $\vec\pi(0)^T$. Intuitively, the influence of the initial
vertex on its surroundings is estimated with T steps of a
random walk. This parameter controls whether local effects
or the stationary state is favored, with $\vec I$ approaching
the stationary probability distribution for large T.
The number of steps is usually kept rather small, due to
computation costs.

As noted in the previous section, for MERW the definition
(23) is very similar to counting paths. Even for
relatively small T, it produces results very close to the
stationary probability distribution. This is expected for
MERW, since its exponential relaxation to the stationary
state is on average faster than GRW's for the benchmark
graphs used. To a large extent we have accounted for that
behavior, as MERW seems to relax very fast within connected
regions (proven for Cayley trees [8, 10]), although
it takes a long time to relax between two identical connected
regions [11].
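
The centrality (23) may be sketched in a few lines. The example below assumes NumPy, a hypothetical five-node graph, and a hypothetical helper `power_centrality`; the final normalization to unit sum is an added convenience for comparing with stationary states and is not part of (23).

```python
import numpy as np

def power_centrality(P, T):
    """Sum of pi(0)^T P^t for t = 1..T with uniform pi(0), as in Eq. (23)."""
    N = P.shape[0]
    pi_t = np.full(N, 1.0 / N)      # uniform initial distribution
    I = np.zeros(N)
    for _ in range(T):
        pi_t = pi_t @ P             # one more step of the walk
        I += pi_t
    return I / I.sum()              # normalization added for comparison only

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

k = A.sum(axis=1)
P_grw = A / k[:, None]
w, v = np.linalg.eigh(A)
lam0, psi0 = w[-1], np.abs(v[:, -1])
P_merw = A * psi0[None, :] / (lam0 * psi0[:, None])

I_grw_short = power_centrality(P_grw, 5)      # T = 5, local effects still visible
I_merw_short = power_centrality(P_merw, 5)
I_grw_long = power_centrality(P_grw, 5000)    # large T, close to stationary state
I_merw_long = power_centrality(P_merw, 5000)
```

For large T the normalized vector approaches $k_i/\sum_j k_j$ for GRW and $\psi_{0i}^2$ for MERW, while small T retains a dependence on the uniform starting distribution.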



C. Centrality based on mean first-passage times, 
stationary distributions, and the principal 
eigenvector of the adjacency matrix 

As shown in Sec. III, there is a close analogy between
$r^2$, which uses powers of the transition matrix discussed
above, and the mean first-passage time matrix M, introduced
in (18). The centrality based on the MFPT matrix
is given by the inverse of $\sum_i M_{if}$, where the sum represents
the average time the information needs to reach
the final vertex $f$ from anywhere in the graph. This definition
is called Markov centrality in [22], although the
Markov process assumed there is GRW. For MERW, as
demonstrated in Fig. 1, the information extracted from
the MFPT matrix and the stationary distribution is largely
equivalent.
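
A minimal sketch of this MFPT-based (Markov) centrality for both walks is given below, assuming NumPy, a hypothetical five-node graph, and illustrative helper names `mfpt` and `markov_centrality`.

```python
import numpy as np

def mfpt(P, pi):
    """MFPT matrix M = (E Z_dg - Z) D from the fundamental matrix Z."""
    N = len(pi)
    Z = np.linalg.inv(np.eye(N) - P + np.outer(np.ones(N), pi))
    return (np.ones((N, N)) @ np.diag(np.diag(Z)) - Z) @ np.diag(1.0 / pi)

def markov_centrality(P, pi):
    """Inverse of sum_i M_if for every final vertex f."""
    return 1.0 / mfpt(P, pi).sum(axis=0)

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

# GRW
k = A.sum(axis=1)
P_grw, pi_grw = A / k[:, None], k / k.sum()

# MERW
w, v = np.linalg.eigh(A)
lam0, psi0 = w[-1], np.abs(v[:, -1])
P_merw, pi_merw = A * psi0[None, :] / (lam0 * psi0[:, None]), psi0 ** 2

c_grw = markov_centrality(P_grw, pi_grw)
c_merw = markov_centrality(P_merw, pi_merw)
```

On this toy graph both walks rank the cut vertex 2 highest; the benchmark comparison in Sec. V is what quantifies how closely the MFPT centrality follows the stationary state in general.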

Clearly, the multiplication by D causes the general
trend $M_{if} \propto \pi_f^{-1}$. Since the stationary state of MERW is
typically distributed over a wide range of values (even on 
almost regular graphs [7(]), MFPTs correlate with it very 
strongly and extend much further, especially on bounded-degree
graphs, than for GRW, whose stationary distribution
is proportional to vertex degrees. In Fig. 1 the
dependence is compared between MERW and GRW.

This observation raises the question: the stationary distribution
of which random walk should be chosen to define
a centrality measure? Indeed, the one used most
widely is GRW, whose stationary state is produced by the
simplest version of the prominent PageRank [4]. The two
random walk centralities have already been compared in
[16] and the conclusion was, among others, that MERW
has "a larger discriminating power between the best and
worst pages," and is sensitive to link farms.

FIG. 1. The stationary probability distribution of MERW
and GRW as a function of the averaged rows of MFPT matrix
(MFPT of hitting a vertex $f$ averaged over all initial vertices)
for a sample graph with N = 1000 vertices. Solid lines have
best fit slopes -1.027 ± 0.001 (MERW) and -1.070 ± 0.005
(GRW). The correlation for GRW is weaker as the degrees of
the graph take values from 10 to 50 and accordingly the stationary
state is quantized. $\pi_f$ for GRW is multiplied by 10 for clarity.

However, the connection to other methods has been
missing. Both paths' statistics and random walks are
linked to the idea of calculating centrality as an eigenvector
associated with the largest eigenvalue of the adjacency
matrix, which is a concept as old as the economic
and sociological papers from 1965 [33] and 1972
[34]. In the latter, this centrality was derived from the
assumption that $\vec I_t = A^t \vec e / \lambda^t$ is the $t$-th order popularity
measure, and that an objective measure should be
taken at $t = \infty$, convergence thus requiring the factor
$\lambda_0^{-1}$. Clearly, this formula is simply the canonical ensemble
version of the one based on paths (20), and the
proposed eigenvector centrality is the square root of the
stationary state of MERW. This is also reminiscent
of the HITS algorithm [3], which nevertheless for directed
graphs uses eigenvectors of $A^T A$.
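
The limiting procedure behind the eigenvector centrality can be sketched as a simple iteration. The example assumes NumPy and a hypothetical five-node graph; it takes $\lambda = \lambda_0$ from an eigensolver rather than estimating it during the iteration, which is a simplification made here for brevity.

```python
import numpy as np

# Hypothetical example graph: triangle 0-1-2 with tail 2-3-4.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

lam0 = np.linalg.eigvalsh(A)[-1]     # largest eigenvalue of A

# t-th order popularity I_t = A^t e / lam0^t, iterated towards t -> infinity
I_t = np.ones(N)
for _ in range(200):
    I_t = A @ I_t / lam0

eig_centrality = I_t / np.linalg.norm(I_t)   # unit-norm eigenvector centrality
pi_merw = eig_centrality ** 2                # its square: MERW stationary state
```

Because the graph is connected and non-bipartite, the components along the other eigenvectors decay geometrically and the iteration settles on the principal eigenvector.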

We note that just as centrality may be defined with the 
use of the principal eigenvector of the adjacency matrix or 
the stochastic matrix (the stationary state vector), there 
is a family of community detection methods analyzing 
the rest of the eigenvectors (often it is the spectrum of 
Laplacian that is analyzed). In fact, each of the methods
of assessing centrality mentioned above has a number of 
counterparts that in a similar manner try to find the 
community structure of a network. In [HJ , a comparison 
is made between GRW and MERW in performance of 
some community finding methods based on the concepts 
presented above. 

V. COMPARISON 

We check the affinity of different centrality measures 
described above (together with the closeness and betweenness
centrality given for reference; see [35]) by comparing
the results they produce for a sample of graphs.
For a given graph, each centrality measure produces a 
vector I, whose consecutive elements are centrality val- 
ues of the corresponding nodes. To compare results of a 
pair of methods on that graph we take the corresponding
pair of vectors and we measure the root mean square
distance between them (cosine or Pearson correlation distance
have been checked as well, and have generated similar
results). After repeating this computation for all pairs
of centrality measures we obtain a square matrix. Since
each graph from the sample produces one such matrix,
we take the average (entrywise) over the whole sample.
The entries of the resultant matrix represent the average
distance between a pair of centrality measures. Finally,
this distance matrix is used as input for an agglomerative
clustering algorithm with average weights, which
generates the dendrograms in Fig. 2. The heights of their
branches correspond to the distances between pairs of
clusters. The maximum standard deviation of the distance
matrix entries is smaller than 0.61%, hence the
results of the clustering algorithm should be correct for
most graphs in the sample.
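
The procedure above can be sketched as follows, assuming NumPy. The centrality vectors are toy stand-ins rather than results from the benchmarks, the function names are hypothetical, and "average weights" is interpreted here as average linkage (the distance between two clusters is the mean of all pairwise distances between their members); whether the vectors are normalized before comparison is not specified in the text and is left out.

```python
import numpy as np

def rms_distance_matrix(vectors):
    """Pairwise root-mean-square distance between rows of `vectors`."""
    V = np.asarray(vectors, dtype=float)
    diff = V[:, None, :] - V[None, :, :]
    return np.sqrt((diff ** 2).mean(axis=2))

def average_linkage(dist):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, height)."""
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average distance over all cross-cluster pairs
                d = np.mean([dist[i][j] for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], float(d)))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return merges

# Toy stand-ins: 3 "centrality measures" on 4 nodes, for a sample of 2 "graphs".
sample = [
    np.array([[0.4, 0.3, 0.2, 0.1],
              [0.4, 0.3, 0.1, 0.2],
              [0.1, 0.2, 0.3, 0.4]]),
    np.array([[0.5, 0.2, 0.2, 0.1],
              [0.5, 0.2, 0.1, 0.2],
              [0.1, 0.1, 0.4, 0.4]]),
]
avg_dist = np.mean([rms_distance_matrix(V) for V in sample], axis=0)
merges = average_linkage(avg_dist)   # merge heights are the dendrogram branch heights
```

In this toy input the first two measures are merged first, and the third joins them at a larger height, mirroring how branch heights in a dendrogram encode distances between clusters.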

For example, in the dendrogram on the left in Fig. 2,
the nearest centralities are 1 and 2 (which denote the sta- 
tionary state of MERW and its MFPT centrality), and 
they were the first to be clustered together. Next, meth- 
ods 5 and 7 were clustered (the shortest paths' centrality 
and the MFPT centrality of GRW), and so on. It can be 
seen that the closeness and betweenness (9 and 10) are 
always clustered at the very end, which means they are 
very distinct from the other methods. This is expected, 
as they are based on a different concept of importance,
and could for instance assign a high centrality score to
a node near a bottleneck, even though it is poorly connected.
The methods 4 and 5 depend on the maximum
path length T taken into account (it is set to the diameter
of the graph, which varies between 4 and 10), so their assignment
might be different for parameter choices other than
shown here.

More importantly, another look at the dendrograms re- 
veals that when the parameters of the benchmark graphs 
change there are two groups of methods that do not mix 
with each other. One includes centralities derived from 
MERW (1, 2, and 3: centralities based on the stationary
state, MFPT, and $P^t$, respectively) and 4, which is based
on weighted paths, while the other includes centralities 
derived from GRW (6, 7, and 8, as for MERW) and 5 
which is based on shortest paths. Thus, methods uti- 
lizing GRW are close to each other, however for graphs 
with easily distinguishable communities they can cluster 
together with other centrality measures. The methods 
utilizing MERW are connected to path enumeration, as 
predicted in Sec. IV A, and they never intermingle with
the other centrality measures. The average distance of 
this whole group from other methods analyzed is greater 
than the analogous distance for the group of GRW meth- 
ods, whereas the average distance between the members 
of this group is smaller than the corresponding value for 
GRW methods. This means that indeed the centralities 
defined by MERW comprise a distinct, close-knit family, 
and produce equivalent results. 



A. Benchmarks 

In the analysis in the previous section LFR benchmark 
graphs [36] were utilized. Since they were designed to
benchmark community finding algorithms, they contain 
communities with preset size distribution, constructed 
with the use of the planted partition model. Although the 
graphs are locally random, they model a range of possible 
real world structures, and so they serve our purpose in 
testing centrality measures. 

We follow the notation used by the authors of the
benchmarks. Thus, by $\mu$ we denote the mixing parameter
(this should not be confused with the usage of $\mu$ in
(20), where chemical potential is meant; context should
make it unambiguous which meaning is intended), which
is the fraction of links a given node shares with the nodes
outside its community. The parameter is approximately
equal for all nodes in a graph. For chosen values of $\mu$ we
take 100 graphs with $N = 200$ vertices; their exponents
for the degree and community size distributions are respectively
$\tau_1 = -2$ and $\tau_2 = -1$. The average and maximum
degrees are 10 and 30, and the community sizes
range from 5 to 35.

FIG. 2. The dendrograms correspond to benchmark graphs
with the mixing parameter $\mu = 0.1, 0.3, 0.6$ (from left to
right). 1, 2, and 3 represent MERW's stationary state,
MFPT, and $P^t$. 6, 7, and 8 are GRW's analogues. 4 denotes
weighted paths', and 5 shortest paths' centrality. 9, 10
are closeness and betweenness. In 3 and 8, the maximal power
of P is $T = 5$. In 4 and 5, $\mu = \ln \lambda_0$ and T equals the diameter
of the graph.



VI. CONCLUSIONS 

In this paper it has been shown that the random walk
distance matrix $r^2(t)$ defined in (14), when modified to
account for walks of all lengths, is equivalent to a symmetric
version of the mean first-passage time matrix M (18),
where the fundamental matrix Z is substituted with the
propagator G.

This observation also leads to the conclusion that a 
number of known centrality measures are nearly identical 
if the random walk under consideration is the maximal- 
entropy random walk. This common perspective includes 
measures related to the properties of graphs: the eigen- 
vector centrality and centrality based on enumeration of 
weighted paths, and those related to random walks: their 
stationary state, powers of their transition matrix, and 
finally their MFPT matrix, as reviewed in Sec. IV.

A numerical investigation on a set of benchmark
graphs confirms this thesis, showing that there is a group
of centrality measures related to GRW that tend to produce
similar results, and an even more homogeneous and
distinct group of centralities related to MERW. To quote
Bonacich [34]: "Three different approaches to calculating
popularity scores have almost the same solution [...]. This
is an economy; three approaches are reduced to just one.
This is the main point of the paper".